╔══════════════════════════════════════════════════════════════════════════╗
║ ║
║ ISoft D&M ║
║ POB. 5517 ║
║ Coralville IA 52241 ║
║ U.S.A ║
║ ║
╚══════════════════════════════════════════════════════════════════════════╝
*******************************************************************************
* RSS.DOC-V3.0 *
* LAST UPDATE - Apr. 22, 1993. (c) 1991, 93 Ron Loewy. *
*******************************************************************************
░░░░░░░░░░░░░ ░░░░░░░░░░░ ░░░░░░░░░░░
░▒▒▒▒▒▒▒▒▒▒▒▒░ ░░▒▒▒▒▒▒▒▒▒▒▒ ░░▒▒▒▒▒▒▒▒▒▒▒
░▒▓▓▓▓▓▓▓▓▓▓▓░▒ ░░▒▒▓▓▓▓▓▓▓▓▓▓▓ ░░▒▒▓▓▓▓▓▓▓▓▓▓▓
░▒▓██████████░▒▓ ░░▒▒▓▓███████████ ░░▒▒▓▓███████████
░▒▓█ on's ░▒▓█ ░░▓▓██ mart ░░▓▓██ earc
░░░░░░░░░░░░░ ▒▓█ ░░░░░░░░░░ ░░░░░░░░░░h
░▒▒▒▒▒▒▒░░▒▒▒▒ ▓█ ▒▒▒▒▒▒▒▒░░ ▒▒▒▒▒▒▒▒░░
░▒▓▓▓▓▓▓▓░░▓▓▓▓ █ ▓▓▓▓▓▓▓▓░░ ▓▓▓▓▓▓▓▓░░
░▒▓███████░░████ ██████░░▒▒ ██████░░▒▒
░▒▓█ ░░ ░░▒▒▓▓ ░░▒▒▓▓
░▒▓█ ░░ ░░░░░░░░░░░▒▒▓▓██ ░░░░░░░░░░░▒▒▓▓██
▒▓█ ▒▒ ▒▒▒▒▒▒▒▒▒▒▒▓▓██ ▒▒▒▒▒▒▒▒▒▒▒▓▓██
▓█ ▓▓ ▓▓▓▓▓▓▓▓▓▓▓██ ▓▓▓▓▓▓▓▓▓▓▓██
█ ██ ███████████ ███████████
*******************************************************************************
* File List *
*******************************************************************************
This package contains the following files :
RSS.EXE - RSS program executable.
RSS.DOC - This file.
RSS.REG - Registration file.
DEMORSS.BDF - Demo Build File for Ralf Brown's Interrupt List.
PROGRAMS.TXT - ISoft D&M shareware products description.
*******************************************************************************
* Why Register *
*******************************************************************************
RSS is a shareware product. If you find this product valuable,
please register it. This section describes the reasons you should register.
By registering you will receive a diskette with the latest RSS version,
and the DPMI (Protected Mode) version of RSS. The DPMI version of RSS
can open dictionaries that have up to 16 segments (vs. only 8 for the
shareware version, which has memory limitations). In addition, you will
help us create the next version of RSS, which will include even more
features than are currently available. We might even
add YOUR enhancement requests!
*******************************************************************************
* What's New *
*******************************************************************************
V3.0 - New multiple size dictionaries - The BIG/SMALL models were replaced
by 1 to 8 (or 16 in the registered DPMI version) dictionaries
that work on more uniform entry distribution.
- RSS will NOT read/manipulate V2.x configuration files; RSS
dictionaries will have to be re-built.
- Because of the above changes, the RSS build file DICTIONARY command
was changed to be : DICTIONARY x, where x is the number of
dictionary segments that will be used.
- Added a demo RSS build file and query tool to query Ralf Brown's
interrupt list. Please refer to the TUTORIAL section of this file.
- From this version, RSS can process more than 8 files during the
build process.
V2.2 - Added the ? help command. From this version RSS is distributed by
ISOFT D&M, P.O.B 5517, Coralville IA 52241, U.S.A
V2.1 - Video configuration is restored better when RSS terminates.
V2.0 - Added the FIND command that displays the text source of the
found key, the SEARCH command "finds" the same keys, but
only displays the key and file name, the find command displays
the actual text.
PLEASE NOTICE - Dictionaries built in previous releases of RSS
are NOT COMPATIBLE with RSS V2.0 and above! Be sure to RE-BUILD
all of your dictionaries before you start using them with RSS V2.0.
I'm sorry for the inconvenience this restriction might cause, but
the new FIND feature is so important that the dictionary structure
change is worth it.
V1.2 - Command re-direction from the DOS prompt is now available.
- Build file definition prefix on command line changed to
'$' from '^', because the '^' character is used by 4DOS as a command
delimiter.
V1.1 - RSS has a new information screen when building the indexes. The old
version listed all the keys with no summary information. Now RSS shows
the summary information of the keys found in each file.
*******************************************************************************
* Introduction *
*******************************************************************************
RSS is an extension of my own TXS text search program. RSS uses the same
technology and search logic to perform FAST LOGIC searches on static text
databases. The difference is that RSS has more sophisticated database
build facilities with enhanced creation parameters, and the ability to
hold huge databases. RSS is a command line program, whereas TXS is a full
CUA environment. As RSS is a superset of TXS, I might integrate the two
products by enhancing future versions of TXS and RSS to work in dual mode -
"simple" mode - current TXS database definition, and "enhanced" mode -
current RSS database definition. If you use this product, I would like
to hear from you about these matters.
The need for RSS arose when I found that I was spending a lot of time searching
a problem index of a company I provide technical support and consulting to. The
index arrives once a month, and is about 3 MB in size. In the index we have
about 500 - 2000 different problem descriptions with search keywords. The
problems are separated by a dashed line, and are uniquely identified by a
problem key number that appears on the first line after the dashed separator
line, with the prefix PROB#=. I found myself hoping for the ability to perform
smart searches with logic operators, the way I do on our telefax database with
my own TXS and XCD programs.
Over the years I found out that I use RSS whenever I want to search Ralf
Brown's excellent interrupt list, and the instructions on doing just that
are given in the Tutorial section of this document.
RSS was built to be generic enough to answer the problem I encountered, and
provide a similar solution to as many cases as possible. RSS allows the user
to define a multi-key, multi-file database, with variable length "entries"
(or records), build an economic existential dictionary on it, and use it as
an index for fast logic searches. RSS allows the definition of common words
that are believed to exist in all of the entries, and should be left
out of the dictionary, by defining an "exclude" dictionary. RSS supports 16
dictionary "models", the smallest model supports dictionaries of up to
30,000 words, and the biggest model supports up to 500,000 words.
If you need a version of RSS that will be able to use more than 500,000
words, please contact the author or the distributor.
Version 2.0 of RSS added the ability to FIND the text that matches the search
criteria in the source text database. RSS does not read the entire text
database to display the relevant text, but saves a pointer and length
indicators in the dictionary created during the build. RSS searches are as
fast as ever, and only the text that meets the search criteria is extracted
from the text database and displayed.
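The pointer-and-length idea can be illustrated with a short Python sketch.
This is not the RSS implementation (RSS is written in Pascal and its dictionary
format is not documented here); the function names, the in-memory sample
database and the assumption that an entry runs from the line after one
separator to the next separator are all hypothetical.

```python
import io

SEPARATOR = b"------"  # assumed separator prefix, as used in DEMORSS.BDF


def build_index(f):
    """Record a (byte offset, byte length) pair for each entry in file f.

    Assumption: an entry starts on the line after a separator line and
    ends just before the next separator line (or at end of file).
    """
    index = []
    start = None  # offset of the current entry, None before first separator
    pos = 0       # offset of the line being examined
    for line in f:
        if line.startswith(SEPARATOR):
            if start is not None and pos > start:
                index.append((start, pos - start))
            start = pos + len(line)
        pos += len(line)
    if start is not None and pos > start:
        index.append((start, pos - start))
    return index


def read_entry(f, offset, length):
    """Fetch one entry by seeking straight to it; nothing else is read."""
    f.seek(offset)
    return f.read(length)


# Invented sample database, held in memory for the demonstration.
database = io.BytesIO(b"------\nA one\nbody\n------\nB two\n")
index = build_index(database)
second = read_entry(database, *index[1])
print(index, second)
```

Because only the stored offset and length are needed at query time, the cost
of displaying a match depends on the size of the matching entry and not on
the size of the whole text database.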
*******************************************************************************
* Tutorial *
*******************************************************************************
In order to help you learn the power and features of the RSS package we have
included a set of files that builds and queries Ralf Brown's Interrupt List.
The Interrupt List is a huge collection of PC interrupts that is available
free on the Internet, and probably elsewhere. In order to follow this tutorial
you are advised to obtain a copy of this package.
The first step in using the RSS package is to create the dictionary for the
text database. In this example we will create a dictionary (index) called
intlist that will be used later to query the interrupt list.
The file DEMORSS.BDF is an RSS Build File that allows RSS to scan the
interrupt list and build the dictionary.
The contents of this file are :
seperator ------
keyline 1
keypos 1
keylen 78
delimiter ' '
delimiter ','
delimiter '='
delimiter '"'
exclude int
exclude seealso:
exclude notes:
exclude -
exclude =
exclude return:
dictionary 8
When we build the RSS dictionary, RSS recognizes the start of a new
topic/entry whenever it finds a line that starts with the string ------ .
(This is the seperator command of the file).
The KEY to the topic is found on the first line after the seperator (keyline)
in column 1 (keypos 1) and has a length of 78 characters (keylen). Take a
look at the interrupt list to see the structure of the topics and verify that
this information is relevant.
The word delimiters are a space (delimiter ' '), and the following characters :
,=" .
The following words are assumed to exist in all topics : int, seealso:, notes:,
return:, - and = .
The last line of the build file specifies a dictionary of 8 segments.
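To make the roles of these build file commands concrete, here is a small
Python sketch that applies the same settings to a few lines of text. It is
only an illustration of the rules described above, not RSS's own code: the
function names are hypothetical, the sample lines are invented (they are not
taken from the interrupt list), and it assumes that every line of an entry,
including the key line, contributes words.

```python
def split_words(text, delimiters):
    """Split text on any of the single-character delimiters, lower-cased."""
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current).lower())
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current).lower())
    return words


def index_entries(lines, separator, keyline, keypos, keylen,
                  delimiters, exclude):
    """Return {key: set of indexed words} for every entry in lines."""
    entries = {}
    body = None  # lines of the current entry; None before first separator

    def flush():
        if body is not None and len(body) >= keyline:
            start = keypos - 1  # keypos is 1-based
            key = body[keyline - 1][start:start + keylen].rstrip()
            words = set()
            for text in body:
                words.update(split_words(text, delimiters))
            entries[key] = words - exclude

    for line in lines:
        if line.startswith(separator):
            flush()
            body = []
        elif body is not None:
            body.append(line)
    flush()
    return entries


# Invented sample text in the general shape of a separator-delimited list.
sample = [
    "--------A-01-----",
    "INT 01 - SAMPLE TOPIC",
    "Notes: alpha, beta=gamma",
    "--------B-02-----",
    "INT 02 - OTHER TOPIC",
    "SeeAlso: INT 01",
]
entries = index_entries(
    sample, separator="------", keyline=1, keypos=1, keylen=78,
    delimiters=' ,="',
    exclude={"int", "seealso:", "notes:", "-", "=", "return:"},
)
for key, words in entries.items():
    print(key, sorted(words))
```

Note that a word such as = can never reach the dictionary once = is also a
delimiter, so excluding it is harmless but redundant in this sketch.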
To CREATE the dictionary use the following command :
RSS b @intlist $demorss interrup.?
This command instructs RSS to create a dictionary called intlist, using
the build file definitions found in the file DEMORSS.BDF, on all the
files that are called INTERRUP.? . (I do not combine the interrupt list; if
you do, replace interrup.? with interrup.lst or whatever you call it).
This operation will take some time, because the interrupt list is a BIG
database.
If the build process is terminated because of memory constraints, try
removing some TSRs from memory. If nothing helps, you will have to
edit the DEMORSS.BDF file and change the dictionary line to specify a
smaller number of dictionary segments. (Each segment is about 64K, so
having 8 segments means we need 512K just for the dictionary segments).
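The memory arithmetic above is easy to tabulate when choosing a smaller
dictionary value. The following Python lines are only a rough estimate based
on the "about 64K per segment" figure; the function name is hypothetical and
the memory RSS needs for anything other than the segments is not counted.

```python
SEGMENT_KB = 64  # approximate size of one dictionary segment, per the text


def dictionary_memory_kb(segments):
    """Approximate memory needed for the dictionary segments alone."""
    return segments * SEGMENT_KB


for segments in (8, 6, 4):
    print(f"dictionary {segments} -> about {dictionary_memory_kb(segments)}K")
```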
After the build process is finished RSS has created several new files
on the hard disk. These files are called INTLIST.*, where * has the
following meaning :
CFG - This is the configuration file. It specifies the key names, and
dictionary structure.
EXC - This is the exclude words dictionary segment.
DC? - The dictionary data segments, where ? is A, B, and so on.
Whenever we want to query the dictionary from now on we will use the
following command :
RSS f @intlist our-query-here
If, for example, we would like to find all entries that have something to
do with the lantastic network we will use a simple query such as :
RSS f @intlist lantastic
If we would like to find all references to netbios that do not include
references to netware we will use :
RSS f @intlist netbios and (not netware)
etc..
Notes :
1 - The Interrupt List is a BIG database with a lot of entries, the
8 dictionary segments are probably the minimal configuration
that will not cause too many collisions. If you have registered
RSS - it is advised that you use RSSX - The protected mode
version, and use a higher value (such as dictionary 16) to
achieve optimal query accuracy.
Example - I wanted to look at the real-time compression features added to
DOS 6, so I performed the following search on version 3.4 of the
interrupt list :
rss f @intlist compress
and received the following answer :
RSS V3.0, ISoft D&M, P.O.B 5517, CORALVILLE IA 52241, U.S.A
Development Modification Level : 002c, Date : Apr., 22, 1993.
Query : compress
Searching intlist
File : INTERRUP.B, Key : INT 1A - MICROSOFT REAL-TIME COMPRESSION INTERFACE (MRCI) - ROM-BASED SERVER
INT 1A - Microsoft Real-Time Compression Interface (MRCI) - ROM-BASED SERVER
AX = B001h
CX = 4D52h ("MR")
DX = 4349h ("CI")
Return: CX = 4943h ("IC") if installed
DX = 524Dh ("RM") if installed
ES:DI -> MRCINFO structure (see below)
Note: this call is functionally identical to INT 2F/AX=4A12h, which should
be called first, as this call is used for the first, ROM-based
MRCI server, while the other call is used for RAM-based servers
which may be partially or entirely replacing a prior server
SeeAlso: INT 2F/AX=4A12h
Format of MRCINFO structure:
Offset Size Description
00h 4 BYTEs vendor signature
"MSFT" Microsoft
04h WORD server version (high=major)
06h WORD MRCI specification version
08h DWORD address of server entry point
0Ch WORD bit flags: server capabilities (see below)
0Eh WORD bit flags: hardware assisted capabilities (see below)
10h WORD maximum block size supported by server (at least 8192 bytes)
Bitfields for capabilities:
bit 0 standard compress
bit 1 standard decompress
bit 2 update compress
bit 3 MaxCompress
bit 4 reserved
bit 5 incremental decompress
bits 6-14 reserved
bit 15 this structure is in ROM and can't be modified
(server capabilities only)
Call MRCI entry point with:
DS:SI -> MRCREQUEST structure (see below)
CX = type of client (0000h application, 0001h file system)
AX = operation
0001h perform standard compression
0002h perform standard decompression
0004h perform update compression
0008h perform MaxCompress
0020h perform incremental decompression
AX = FFFFh clear flags
BX = bitmask of flags to clear (set bits in BX are flags to clear)
Return: AX = status
0000h successful
0001h invalid function
0002h server busy, try again
0003h destination buffer too small
0004h incompressible data
0005h bad compressed data format
Note: MRCI driver may chain to a previous driver
Format of MRCREQUEST structure:
Offset Size Description
00h DWORD pointer to source buffer
04h WORD size of source buffer (0000h = 64K)
06h WORD (UpdateCompress only)
(call) offset in source buffer of beginning of changed data
(return) offset in destination buffer of beginning of changed
compressed data
08h DWORD pointer to destination buffer
must contain original compressed data for UpdateCompress
0Ch WORD size of destination buffer (0000h = 64K)
any compression: size of buffer for compressed data
standard decompression: number of bytes to be decompressed
incremental decompression: number of byte to decompress now
(return) actual size of resulting data
0Eh WORD client compressed data storage allocation size
10h DWORD incremental decompression state data
set to 00000000h before first incremental decompression call
Notes: the source and destination buffers may not overlap
the source and destination buffer sizes should normally be the same
application should not update the contents of the MRCREQUEST structure
between incremental decompression calls
--------c-1AC0-------------------------------
Found 1 entry matches
*******************************************************************************
* Terminology *
*******************************************************************************
In order to understand RSS operation and customization we will define the
following terms :
SEPERATOR - a string prefix that is used to define the end of an entry, and the
beginning of a new one.
KEYLINE - an integer giving the number of lines below the SEPERATOR line
at which the entry's key resides.
KEYPOS - the position, counted from the start of the line, where the key starts.
KEYLEN - the number of characters used to define the key.
DELIMITER - a character used to separate two words in a text database.
EXCLUDE - a word that appears in all of the entries in the database, and
that we want to exclude from the dictionary.
*******************************************************************************
* Dictionary-Definition *
*******************************************************************************
In order to define the dictionary to RSS, the user must create a dictionary-
build-file-definition. This file has the following structure :
seperator   string   start looking at the line start
keyLine     integer  no. of lines below the seperator line where key is ..
keyPos      integer  key starts at pos on keyLine
keyLen      integer  keyLength
delimiter   char     character used as delimiter in the area ...
exclude     word     words to exclude
dictionary  parm     parm = number of dictionary segments to use.
Example :
seperator =======   seperator is a string of 7 = characters at the beginning
                    of the line, rest of line is ignored.
keyline 1           key is in first line below the seperator
keypos 12           key starts in column 12
keylen 5            and is 5 characters long
delimiter ' '       space is a delimiter
delimiter ','       , is a delimiter
delimiter '='       = is a delimiter
exclude he          he will not be entered to dictionary
exclude prod        prod -"- ...
dictionary 6        dictionary will use 6 dictionary segments.
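For readers who want to check a build file before running a long build, the
structure above is simple enough to read with a few lines of Python. This is
a sketch under stated assumptions, not the parser RSS uses: the function name
and the returned layout are hypothetical, keywords are assumed to be case
insensitive, each line is assumed to hold one keyword followed by one value,
and a delimiter value is assumed to be a single character in single quotes.

```python
def parse_build_file(text):
    """Parse build file text into a dict of settings (hypothetical layout)."""
    config = {"delimiters": [], "exclude": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.lower()
        rest = rest.strip()
        if not rest:
            raise ValueError(f"missing value for {keyword!r}")
        if keyword == "delimiter":
            # Value is one character between single quotes, e.g. ' ' or ','
            if len(rest) < 3 or rest[0] != "'" or rest[2] != "'":
                raise ValueError(f"bad delimiter line: {raw!r}")
            config["delimiters"].append(rest[1])
        elif keyword == "exclude":
            config["exclude"].append(rest.split()[0].lower())
        elif keyword == "seperator":  # spelled as in the build file format
            config["seperator"] = rest.split()[0]
        elif keyword in ("keyline", "keypos", "keylen", "dictionary"):
            config[keyword] = int(rest.split()[0])
        else:
            raise ValueError(f"unknown keyword: {keyword!r}")
    return config


# The contents of DEMORSS.BDF as listed in the Tutorial section.
DEMO = """seperator ------
keyline 1
keypos 1
keylen 78
delimiter ' '
delimiter ','
delimiter '='
delimiter '"'
exclude int
exclude seealso:
exclude notes:
exclude -
exclude =
exclude return:
dictionary 8
"""

config = parse_build_file(DEMO)
print(config)
```

Only the first token after the keyword is read as the value, so explanatory
text to the right of a value, as in the example above, is ignored by this
sketch. Whether RSS itself tolerates such text in a real build file is not
stated in this document.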
*******************************************************************************
* Operation *
*******************************************************************************
At the DOS command line type RSS to get the following help screen :
RSS V3.0, ISoft D&M, P.O.B 5517, CORALVILLE IA 52241, U.S.A
Usage ..
RSS cmd cmd-parm
Where cmd Are ..
b - build database from files given in cmd-parm
s - search database for expression given in cmd-parm
f - find in database entries for cmd-parm expression
? - display help screen
Build format ..
RSS b [@dictionary-file] [$definition-file] files [files]
Search command ..
RSS s [@dictionary-file] logic-search-expr
Find command ..
RSS f [@dictionary-file] logic-search-expr
Notice ..
dictionary-file - with no suffix, will use .DCT and .CFG
definition-file - used to build dictionary.
IMPORTANT : DO NOT SPECIFY .BDF SUFFIX TO DEFINITION FILE!
*** Remarks :
When you enter file names to build in the build command, DOS wildcards are
allowed.
If you omit the dictionary-file, RSS will try to get the dictionary name
from the environment variable RSSDICT. If this variable does not exist, RSS
will use the name 'RSSDICT' to build the dictionary configuration and index
files.
If you omit the definition-file, RSS will try to get the definition name
from the environment variable RSSBLDF. If this variable does not exist, RSS
will use the name 'RSSBLDF' to read the build configuration parameters.
PLEASE NOTICE - dictionary and definition file parameters given MUST NOT
include any file name extension.
(e.g. RIGHT -> rss b @dict1 $def *.*
WRONG -> rss b @dict1 $def.bdf *.* )
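The fallback order described in these remarks (command line, then environment
variable, then a fixed default name) can be summarised in a short Python
sketch. It is an illustration only: the function name is hypothetical, the
environment is passed in as a plain mapping so the example can be run
anywhere, and rejecting a name with an extension is this sketch's way of
expressing the rule above, not a statement of what RSS itself does with one.

```python
import ntpath  # DOS/Windows style path handling, available on any platform


def resolve_name(argument, prefix, env_var, environ):
    """Pick the dictionary or definition file name.

    argument - command line token such as '@intlist' or '$demorss',
               or None when it was omitted
    prefix   - '@' for dictionary-file, '$' for definition-file
    env_var  - 'RSSDICT' or 'RSSBLDF'; also used as the default name
    environ  - mapping of environment variables
    """
    if argument is not None:
        if not argument.startswith(prefix):
            raise ValueError(f"expected {prefix!r} prefix: {argument!r}")
        name = argument[len(prefix):]
    else:
        name = environ.get(env_var, env_var)
    if ntpath.splitext(name)[1]:
        raise ValueError(f"name must not include an extension: {name!r}")
    return name


print(resolve_name("@intlist", "@", "RSSDICT", {}))
print(resolve_name(None, "@", "RSSDICT", {"RSSDICT": "mydict"}))
print(resolve_name(None, "$", "RSSBLDF", {}))
```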
The logic search expressions understood by RSS have the following format :
[NOT] search-word-1 [AND | OR | XOR [NOT] search-word-2 [AND | ...]]
Some examples will clarify the definition :
RON and not landmark - will print all of the files that
contain the word RON, but do not
contain the word LANDMARK.
JOG or DIE - will print all the files that
contain the word JOG, or the
word DIE, or both of them.
JOG xor DIE - All of the files that contain
either one of the words DIE or JOG,
but not both of them.
Some points to consider :
RSS does not distinguish between upper and lower case letters -
lanDmark, LANDMARK, landMArK and landmark are all the same.
RSS definition for a word - any set of characters separated by delimiters.
Operator precedence : NOT, AND, XOR, OR.
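The expression rules can be tried out with the following Python sketch, which
evaluates a query against the set of words found in one entry. It is not the
RSS search engine: the function names are hypothetical, the precedence list
above is assumed to run from highest (NOT) to lowest (OR), parentheses are
accepted because the Tutorial uses them, and exact word sets are used in place
of the RSS dictionary.

```python
import re


def tokenize(query):
    """Split a query into lower-case words, operators and parentheses."""
    return re.findall(r"\(|\)|[^\s()]+", query.lower())


def matches(query, words):
    """Return True if the entry containing `words` satisfies `query`."""
    words = {w.lower() for w in words}
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1

    def parse_or():            # lowest precedence
        value = parse_xor()
        while peek() == "or":
            take()
            rhs = parse_xor()  # always parse the right side first
            value = value or rhs
        return value

    def parse_xor():
        value = parse_and()
        while peek() == "xor":
            take()
            rhs = parse_and()
            value = value != rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "and":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():           # highest precedence
        tok = peek()
        if tok == "not":
            take()
            return not parse_not()
        if tok == "(":
            take()
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return value
        if tok is None or tok in (")", "and", "or", "xor"):
            raise ValueError(f"search word expected, found {tok!r}")
        take()
        return tok in words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


print(matches("RON and not landmark", {"ron", "txs"}))
print(matches("JOG xor DIE", {"jog", "die"}))
print(matches("netbios and (not netware)", {"netbios", "lantastic"}))
```

Under the assumed precedence, a query such as a or b xor c is read as
a or (b xor c), and a xor b and c is read as a xor (b and c). Use parentheses
whenever a different grouping is intended.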
*******************************************************************************
* Warranty *
*******************************************************************************
There is no warranty whatsoever. The program is supplied as is.
The distributor (ISoft D&M), or the author (Loewy Ron), are not,
and will not be responsible for any damages, lost profits,
or inconveniences caused by the use, or inability to use this program.
The use of the program is at your own risk.
By using (or attempting to use) the program you agree to this.
*******************************************************************************
* Distribution *
*******************************************************************************
RSS is distributed by ISoft D&M, P.O.B. 5517 CORALVILLE IA 52241, U.S.A.
RSS is (c) copyrighted by Loewy Ron, 1991, 93.
RSS is a shareware program, please register your copy.
To register your copy of RSS please refer to the supplied
RSS.REG file.
Other programs distributed by ISoft D&M are described in the supplied
PROGRAMS.TXT file.
*******************************************************************************
* Contact *
*******************************************************************************
Please contact :
ISoft D&M,
P.O.B 5517
Coralville IA 52241,
U.S.A
To contact the author directly :
Contact : Loewy Ron,
9 Haneveem st.
Herzeliya, 46465
ISRAEL.
e-mail address : CompuServe - 100274,162
*******************************************************************************
* Credits *
*******************************************************************************
RSS was written using Turbo Pascal 6.0, and Borland Pascal 7.0.
(Trademarks of Borland International).
4DOS is a copyright of J.P. Software.
E.T. Floyd wrote the DDJ published DICT unit. I used the ideas in this unit
to create RSS, and was helped by the published source code, but my dictionary
uses a different hash algorithm. (I saw that in DDJ Jan. 1991 Mr. Floyd
answered a letter regarding the hash algorithm.) I removed some of the
code I did not need from the DICT unit, and added the ability to remove
keys from the dictionary.
The interrupt list compilation is a Copyright 1989, 93 of Ralf Brown.
MS DOS is a trademark of Microsoft Corporation.